Background: Proliferation has emerged as a major prognostic factor in luminal breast cancer. The immunohistochemical (IHC) 
proliferation marker Ki67 has been most extensively investigated but has not gained widespread clinical acceptance. 
Methods: We have conducted a head-to-head comparison of a panel of proliferation markers, including Ki67. Our aim was to 
establish the marker of the greatest prognostic utility. Tumour samples from 3093 women with breast cancer were constructed as 
tissue microarrays. We used IHC to detect expression of mini-chromosome maintenance protein 2, Ki67, aurora kinase A (AURKA), 
polo-like kinase 1, geminin and phospho-histone H3. We used a Cox proportional-hazards model to investigate the association with 
10-year breast cancer-specific survival (BCSS). Missing values were resolved using multiple imputation. 

Results: The prognostic significance of proliferation was limited to oestrogen receptor (ER)-positive breast cancer. Aurora kinase A 
emerged as the marker of the greatest prognostic significance in a multivariate model adjusted for the standard clinical and molecular 
covariates (hazard ratio 1.3; 95% confidence interval 1.1-1.5; P = 0.005), outperforming all other markers including Ki67. 
Conclusion: Aurora kinase A outperforms other proliferation markers as an independent predictor of BCSS in ER-positive breast 
cancer. It has the potential for use in routine clinical practice. 
The intrinsic molecular subtypes have become central to breast 
cancer research (Perou et al, 2000; Sørlie et al, 2001). However, 
their successful translation into clinical diagnostic assays has not 
yet been achieved and remains a priority if patients are to benefit 
from the knowledge of the molecular heterogeneity of breast 
cancer. The current assessment of the histological and clinical 
characteristics of tumours fails to identify patients most appro- 
priate for adjuvant systemic therapy. Although adjuvant therapy 
significantly improves breast cancer survival (EBCTCG, 1998; 
Berry et al, 2005), it is generally accepted that a substantial 
proportion of patients who are at low risk of relapse are 
nonetheless receiving adjuvant chemotherapy, hence experience 
the side effects of the treatment without deriving much benefit 
(EBCTCG, 1998; Berry et al, 2006). The translation of the intrinsic 
subtypes of breast cancer into clinical assays may enable us to 
stratify patients by their likelihood of benefiting from adjuvant 
treatment. 

This problem is most serious amongst patients with oestrogen 
receptor (ER)+ disease because those with ER- disease are 
known to derive greater absolute benefit from adjuvant chemotherapy 
(EBCTCG, 1998; Berry et al, 2006). Indeed, ER has been proposed 
as a determinant of whether patients should receive chemotherapy 
(Henderson, 2010; Pritchard, 2011; Regierer et al, 2011); however, 
according to the largest meta-analysis, the proportional risk 
reduction for mortality is not significantly different by ER status 
(EBCTCG, 1998). By gene expression profiling ER+ tumours 
are classified as luminal A or luminal B (Perou et al, 2000). 
Luminal B tumours are defined by the expression of higher levels 
of proliferation-related genes, including MKI67, than luminal A 
tumours (Perou et al, 2000). Although a proportion of luminal B 
tumours can be distinguished from luminal A tumours by 
detecting amplification of human epidermal growth-factor recep- 
tor 2 (HER2), the remainder are more difficult to identify. Ki67 
expression by immunohistochemistry (IHC) has been used as a 
means of identifying HER2-negative luminal B tumours, successfully 
defining a subset of ER+ cases with poor outcome (Cheang 
et al, 2009). In this case, Ki67 was used as a surrogate tissue-based 
readout of proliferation in order to recapitulate the classification 
originally based on clustering of tumour transcriptomes (Perou 
et al, 2000; Cheang et al, 2009). That proliferation is a powerful 
prognostic factor in breast cancer is evidenced by its inclusion in 
the assessment of histological grade as mitotic count, which has 
recently been shown to be largely responsible for the prognostic 
value of tumour grade (Abdel-Fatah et al, 2010). Moreover, the 
prognostic power of multigene predictors in breast cancer has 
been shown to be almost exclusively attributable to proliferation 
and cell-cycle-related genes and limited to ER+ breast cancer, because 
ER- cases are nearly always deemed high risk (Teschendorff et al, 
2006; Desmedt et al, 2008; Wirapati et al, 2008). 
Although MKI67 is invariably included amongst the proliferation 
genes of multigene predictors, other cell-cycle-related genes 
have received less attention (Paik et al, 2004; Paik, 2007). 
Assessment of Ki67 expression by IHC holds promise 
as a prognostic and predictive biomarker; however, reports 
have been conflicting and comparison between studies is made 
difficult by varying methodologies and cut-points for positivity 
(Urruticoechea et al, 2005; Yerushalmi et al, 2010). Indeed, 
guidelines have been produced in order to address these 
limitations (Dowsett et al, 2011). Although Ki67 is not generally 
used in the routine management of breast cancer, it has recently 
been recommended by the St Gallen consensus committee for 
discriminating between luminal A and luminal B tumours 
(Goldhirsch et al, 2011). Alternative IHC markers of proliferation 
have been proposed and have included those involved in cell-cycle 
control, including cyclin E (CCNE1) (Keyomarsi et al, 2002) and 
those that carry out the function of DNA licensing for replication 
(Gonzalez et al, 2004, 2005). Both mini-chromosome maintenance 
protein 2 (MCM2) and geminin (GMNN), which licence DNA for 
replication and inhibit re-replication of DNA, respectively, have 
been shown to carry prognostic value in breast cancer (Gonzalez 
et al, 2003, 2004). Assessment of a panel of cell-cycle-related 
proteins, including MCM2, has been proposed to differentiate 
actively cycling cells, those in-cycle but with arrested progression 
and those out of cycle, which may provide prognostic information 
in breast cancer (Loddo et al, 2009). Thus in a manner analogous 
to gene signatures of proliferation, measuring multiple prolifera- 
tion-related proteins has been hypothesised to carry greater 
prognostic information than relying on a single marker. 

We compared the prognostic value of a panel of proliferation 
markers measured using IHC in a large cohort of tumours 
represented in tissue microarrays (TMAs). We selected MCM2, 
Ki67, aurora kinase A (AURKA), polo-like kinase 1 (PLK1), GMNN 
and phospho-histone H3 (PHH3) based on their differential 
expression in the phases of cell cycle (Loddo et al, 2009). Our 
aims were to establish the marker of greatest prognostic utility and 
to investigate whether a multi-marker assessment of proliferation 
offered additional prognostic value compared with a single 
marker. 



MATERIALS AND METHODS 
Study population 

The prospective population-based study SEARCH (studies of 
epidemiology and risk factors in cancer heredity) was used for this 
work. This study primarily includes women < 70 years with early 
breast cancer who are identified through the East Anglia Cancer 
Registry. Details of this study have been published previously 
(Lesueur et al, 2005). A total of 3093 patients were included. The 
characteristics of the study cohort are detailed in Table 1. Available 
data included breast cancer-specific mortality, clinical and 
treatment data. Previously generated data on the IHC markers 
ER, progesterone receptor (PR), HER2, cytokeratin 5/6 (CK5/6) 
and epidermal growth-factor receptor (EGFR) were also available 
(Blows et al, 2010). The SEARCH study is approved by the 
Cambridgeshire 4 Research Ethics Committee; all the study 
participants provided written informed consent. 



Tissue microarrays, IHC and scoring 

Each tumour was represented by a single 0.6-mm tissue core in a 
TMA constructed from paraffin-embedded tissue blocks guided 
by haematoxylin and eosin stained slides marked for invasive 
carcinoma, as previously described (Kononen et al, 1998). Tissue 
microarray sections of 3-4 μm thickness were dewaxed in xylene 
and rehydrated through graded alcohols. Immunohistochemistry 



Table 1 Characteristics of the SEARCH study cohort

Variable                              SEARCH
Mean age (range)                      52 (24-73)
Mean follow-up in years (range)       9.2 (0.37-18.6)
Number of breast cancer deaths (%)    465 (15)
5-year survival (%)                   90

Variable             Categories    Number    Percent
Age at diagnosis     <55           1977      64
                     >55           1116      36
                     Missing       0         0
Grade                1             610       20
                     2             1290      42
                     3             793       26
                     Missing       400       13
Node status          Negative      1737      56
                     Positive      1067      35
                     Missing       289       9
Tumour size          <2 cm         1672      54
                     2-4.9 cm      1143      37
                     ≥5 cm         101       3
                     Missing       177       6
ER status            Negative      588       19
                     Positive      1772      57
                     Missing       733       24
PR status            Negative      670       22
                     Positive      1692      55
                     Missing       731       24
HER2 status          Negative      1973      64
                     Positive      272       9
                     Missing       848       27
Chemotherapy         No            2067      67
                     Yes           1025      33
                     Missing       1         <1
Endocrine therapy    No            548       18
                     Yes           2545      82
                     Missing       0         0

Abbreviations: ER = oestrogen receptor; HER2 = human epidermal growth-factor 
receptor 2; PR = progesterone receptor. 



was conducted using a BondMax Autoimmunostainer (Leica, Bucks, 
UK). Details of reagents and antigen retrieval conditions are 
summarised in Supplementary Table S1. Bound primary antibody 
was detected using a BOND polymer detection kit (Leica) and 
developed with 3,3'-diaminobenzidine. Stained slides were 
inspected for uniformity of staining or assay failure and those 
not considered interpretable were excluded from assessment. The 
Ariol platform (Genetix Limited, Hampshire, UK) was used to scan 
slides and the resulting images were used for scoring. Details of 
scoring systems for all markers are provided in Supplementary 
Table S1. Proliferation markers were scored according to the 
proportion of positive cells only, using an Allred proportion score 
(0 = 0%, 1 = <1%, 2 = 1-10%, 3 = 11-33%, 4 = 34-66% and 
5 = >66%). For MCM2, Ki67, GMNN and PHH3, a cell was 
considered positive if there was any nuclear signal above 
background, whereas for AURKA and PLK1, any cell with nuclear 
or cytoplasmic signal above background was deemed positive 
(Figure 1). 
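
The Allred proportion score bands given above can be written as a simple lookup. The following is a minimal sketch in Python and is not the scoring software used in the study. The function names, the handling of values that fall between the published bands (for example 10.5%) and the example cut-off are assumptions made for illustration.

```python
def allred_proportion_score(percent_positive: float) -> int:
    """Map the percentage of positive tumour cells (0-100) to a score of 0-5.

    Bands follow the text: 0 = 0%, 1 = <1%, 2 = 1-10%, 3 = 11-33%,
    4 = 34-66%, 5 = >66%. Values between bands (e.g. 10.5) are assigned to
    the higher band, which is an assumption, not a rule stated in the paper.
    """
    if not 0 <= percent_positive <= 100:
        raise ValueError("percent_positive must be between 0 and 100")
    if percent_positive == 0:
        return 0
    if percent_positive < 1:
        return 1
    if percent_positive <= 10:
        return 2
    if percent_positive <= 33:
        return 3
    if percent_positive <= 66:
        return 4
    return 5


def is_positive(score: int, cutoff: int) -> bool:
    """Dichotomise a proportion score: positive if strictly above the cut-off."""
    return score > cutoff


# Hypothetical example: 5% of cells stained, dichotomised at a cut-off of >1.
example_score = allred_proportion_score(5.0)
example_status = is_positive(example_score, cutoff=1)
```

The dichotomisation helper mirrors the "greater than cut-off" convention used for marker status; the cut-off itself would be chosen per marker.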
Figure 1 Photomicrographs of representative immunostaining for all the proliferation markers. 



Definition of molecular subtype 

In order to investigate the relationship between proliferation 
markers and molecular subtype, a surrogate IHC-based classifier 
was used, as previously described (Blows et al, 2010). Molecular 
subtypes were defined as: luminal 1a (ER+ or PR+, HER2-, 
CK5/6- and EGFR-), luminal 1b (ER+ or PR+, HER2-, 
CK5/6+ or EGFR+), luminal 2 (ER+ or PR+, HER2+), HER2 
(ER- and PR- and HER2+), core basal phenotype (ER- and 
PR-, HER2-, CK5/6+ or EGFR+) and five-marker negative 
phenotype (ER-, PR-, HER2-, CK5/6- and EGFR-). 
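
The surrogate classifier defined above is a short decision rule over five binary markers. The sketch below is one possible Python rendering, assuming that all five markers are known for a case (missing values are not handled). The function name and the label strings, including the reading of the luminal subtype names as "luminal 1a", "luminal 1b" and "luminal 2", are illustrative choices.

```python
def ihc_subtype(er: bool, pr: bool, her2: bool, ck56: bool, egfr: bool) -> str:
    """Assign an IHC-based molecular subtype from five binary marker results."""
    hormone_receptor_positive = er or pr      # ER+ or PR+
    basal_marker_positive = ck56 or egfr      # CK5/6+ or EGFR+

    if hormone_receptor_positive:
        if her2:
            return "luminal 2"
        # HER2-negative luminal cases are split by basal marker expression.
        return "luminal 1b" if basal_marker_positive else "luminal 1a"

    # ER- and PR- from here on.
    if her2:
        return "HER2"
    return "core basal" if basal_marker_positive else "five-marker negative"
```

Testing hormone receptor status first and HER2 second keeps each of the six categories mutually exclusive, as in the definitions given in the text.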



Statistical analyses 

All the analyses were stratified according to the ER status in order 
to account for the fundamental differences between ER+ and 
ER- tumours (Pharoah and Caldas, 2010). This study complied 
with REMARK (reporting recommendations for tumour-marker 
prognostic studies) criteria (McShane et al, 2005). Correlations 
between ordinal variables were made using Spearman's rank 
correlation coefficient. A log-rank test was used to compare 
survival between strata in Kaplan-Meier survival plots. Associa- 
tion with survival was assessed using a Cox proportional-hazards 
model with 10-year breast cancer-specific survival (BCSS) as 
outcome, providing a hazard ratio (HR) and 95% confidence 
interval (CI) for each variable. The date of study entry rather than 
date of diagnosis was used to determine time under observation 
(left-truncation) in order to adjust for unobserved events (Azzato 
et al, 2009). For the analysis of associations with clinical 
characteristics, all the proliferation markers were modelled as 
dichotomous and the significance of associations was tested by 
Pearson's Chi-square test or Fisher's exact test as appropriate. The 
cut-off for dichotomisation was informed by comparing strata 
against non-expressing cases in a Cox proportional-hazards 
model. For survival analyses, proliferation markers were modelled 
as continuous or dichotomous according to the relative fit of 
multivariate models adjusted for the standard prognostic factors, 
assessed using likelihood ratios. The standard log-log plots were 
used to explore compliance with the proportional-hazards 
assumption. Where markers violated the assumption, the Cox 
model was extended to include a coefficient for each time- 
dependent covariate, which varied as a function of log-time, 
indicating the direction and magnitude of change in relative risk 
with time. That is, the exponentiated coefficient will be >1 if risk 
increases with time and <1 if risk decreases with time. The 
P-value of the time varying coefficient was used to determine 
whether to model a covariate as time-dependent in different 
subgroups. The prognostic value of proliferation markers was 
directly compared by including all markers in a Cox model that 
was modified in a backward stepwise manner to identify 
proliferation markers that carried prognostic value independent 
of each other. These markers were then included in a multivariate 
model with age (>55 years), lymph node status, grade, tumour 
size (<2, 2-4.9 and ≥5 cm), endocrine therapy, adjuvant 
chemotherapy, PR and HER2 status. Grade and tumour size were 
modelled as continuous variables. This model was modified in a 
backward stepwise manner until the most parsimonious fit was 
attained. In order to adjust for the inevitable selection bias 
associated with missing data in molecular pathology studies 
(Hoppin et al, 2002), we used multiple imputation (MI) to resolve 
missing values for all the variables included in multivariate models, 
including an outcome indicator variable in the model (Moons et al, 
2006), generating 50 data sets as previously described (Ali et al, 
2011). We have recently validated MI as a method of handling 
missing data in molecular pathology prognostic marker studies 
(Ali et al, 2011). Results for survival analyses conducted on 
imputed data represent a combination of analyses for each of the 
50 data sets and are presented alongside the results from analyses 
excluding cases with missing data (complete case analysis) for 
comparison. All statistical analyses were conducted using Intercooled 
Stata version 11.1 (StataCorp., College Station, TX, USA). 
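
The text states that results on imputed data combine the analyses of the 50 data sets but does not spell out the combination rule. A common choice is Rubin's rules, applied on the log hazard ratio scale, and the sketch below assumes that approach. The function name is hypothetical, and a normal quantile (1.96) is used for the interval in place of the t reference distribution, which is a simplification.

```python
import math
from statistics import mean, variance


def pool_log_hazard_ratios(log_hrs, std_errors, z=1.96):
    """Pool per-data-set log hazard ratios with Rubin's rules.

    Returns (pooled HR, lower bound, upper bound) on the hazard ratio scale.
    """
    m = len(log_hrs)
    if m < 2 or m != len(std_errors):
        raise ValueError("need at least two estimates and matching standard errors")

    pooled = mean(log_hrs)                          # pooled point estimate
    within = mean([se ** 2 for se in std_errors])   # average within-imputation variance
    between = variance(log_hrs)                     # between-imputation variance (m - 1 denominator)
    total = within + (1 + 1 / m) * between          # Rubin's total variance
    se_total = math.sqrt(total)

    return (
        math.exp(pooled),
        math.exp(pooled - z * se_total),
        math.exp(pooled + z * se_total),
    )
```

When the estimates differ between imputed data sets, the between-imputation term widens the pooled interval relative to any single analysis, which is how the uncertainty due to missing data enters the reported confidence intervals.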



RESULTS 

The characteristics of the study cohort are summarised in Table 1. 
There were 465 deaths from breast cancer with 416 occurring 
within 10 years of diagnosis. Excluding cases with missing data, 75% 
of the cohort was ER+, 72% was PR+ and 12% was HER2+. 



Correlations and associations of proliferation markers 

All the proliferation markers were significantly correlated with 
each other and tumour grade in both ER+ and ER- disease 
(Table 2). In ER+ disease, GMNN was most strongly correlated 
with grade with a Spearman's ρ of 0.31 (P<0.0001). In ER- 
disease, GMNN and Ki67 were most strongly correlated with grade, 
each with a Spearman's ρ of 0.39 (P<0.0001). Correlation 
between proliferation markers was strongest for Ki67 and MCM2 
(Spearman's ρ = 0.55; P<0.0001) in ER+ disease, whereas in 
ER- disease, it was Ki67 and GMNN that showed the strongest 
correlation (Spearman's ρ = 0.59; P<0.0001). These weak to 
moderate correlations between proteins, putatively tracking the 
same biological process, may be explained by the proportion of cell 
cycle during which each protein is expressed. The number of cases 
with higher Allred proportion scores was smaller for proteins 
expressed for a shorter period of cell cycle (Table 3). For example, 
MCM2, which is expressed for the longest period during cell cycle 
of any of the proteins (early and late G1, G2, S and M), was 
expressed by 13% of cases (after excluding those with missing 
data) in >66% of cells and 36% of cases in >10% of cells. In 
contrast, PHH3, which is expressed for the shortest period during 
cell cycle (M phase only), was expressed by 11% of cases in > 10% 
of cells, with no cases expressing PHH3 in >66% of cells. 
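
Because Allred scores and grade are ordinal with many tied values, Spearman's coefficient has to be computed on average ranks. The sketch below shows this in plain Python as an illustration. It is not the Stata routine used in the study, and the function names are hypothetical.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson correlation of the average ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return cov / (sx * sy)
```

With only six possible score values per marker, ties are the rule rather than the exception, so the simplified formula based on squared rank differences would not be appropriate here.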

Proliferation markers were associated with adverse clinical 
characteristics in ER+ disease. Both AURKA and GMNN were 
significantly associated with positive lymph node status (Table 4). 
Of the two, AURKA showed the stronger association, with 46% of 
AURKA+ cases being lymph node positive compared with 35% of 
AURKA- cases (P<0.001). All the proliferation markers, except 
Table 2 Correlation between proliferation markers and grade in ER-positive and ER-negative disease 

ER-positive
          Grade    MCM2     Ki67     AURKA    PLK1     GMNN     PHH3
Grade     1
MCM2      0.26     1
Ki67      0.27     0.55     1
AURKA     0.26     0.37     0.39     1
PLK1      0.22     0.36     0.32     0.36     1
GMNN      0.31     0.52     0.49     0.46     0.39     1
PHH3      0.20     0.28     0.28     0.25     0.30     0.34     1

ER-negative
          Grade    MCM2     Ki67     AURKA    PLK1     GMNN     PHH3
Grade     1
MCM2      0.31     1
Ki67      0.39     0.54     1
AURKA     0.35     0.31     0.48     1
PLK1      0.26     0.37     0.35     0.35     1
GMNN      0.39     0.52     0.59     0.54     0.36     1
PHH3      0.26     0.25     0.42     0.39     0.25     0.50     1

Abbreviations: AURKA = aurora kinase A; ER = oestrogen receptor; GMNN = geminin; MCM2 = mini-chromosome maintenance protein 2; PHH3 = phospho-histone H3; 
PLK1 = polo-like kinase 1. All correlations were significant at P<0.0001. 



Table 3 Distribution of proliferation marker Allred proportion scores 

                       MCM2          Ki67          AURKA         PLK1          GMNN          PHH3
Variable               n      %      n      %      n      %      n      %      n      %      n      %
Proportion score^a
  0 (0%)               549    18     967    31     669    22     643    21     554    18     855    28
  1 (<1%)              345    11     278    9      732    24     406    13     571    18     517    17
  2 (1-10%)            511    17     464    15     453    15     318    10     690    22     150    5
  3 (11-33%)           294    10     447    14     122    4      88     3      277    9      17     1
  4 (34-66%)           221    7      104    3      26     1      27     1      84     3      4      0
  5 (>66%)             286    9      57     2      4      0      7      0      31     1      0      0
  Missing              887    29     776    25     1087   35     1604   52     886    29     1550   50
Cut-off                >3            >2            >1            >2            >2            >0
Status^b
  Negative             1699   77     1709   74     1401   70     1367   92     1125   51     855    55
  Positive             507    23     608    26     605    30     122    8      1082   49     688    45

Abbreviations: AURKA = aurora kinase A; GMNN = geminin; MCM2 = mini-chromosome maintenance protein 2; PHH3 = phospho-histone H3; PLK1 = polo-like kinase 1. 
^a Allred proportion score (0 = 0%, 1 = <1%, 2 = 1-10%, 3 = 11-33%, 4 = 34-66% and 5 = >66%). ^b Excluding missing cases. Note: Percentages have been rounded to the 
nearest whole number. 



PLK1 and PHH3, were significantly associated with HER2 
positivity. MCM2 showed the strongest association, with 19% of 
MCM2+ cases being HER2+ compared with just 6% of MCM2- 
cases (P<0.001). In contrast, in ER- disease, the pattern of 
association was less clear, with some indication of an association 
with favourable clinical characteristics (Supplementary Table S2). 
For example, only PLK1 was significantly associated with lymph 
node status in ER- cases. However, this association was with 
negative lymph node status, with 65% of PLK1+ cases being 
lymph node negative compared with 49% of PLK1- cases 
(P = 0.024). Similarly, both AURKA and PLK1 showed a negative 
association with HER2 positivity. In all, 81% of AURKA+ cases 
were HER2- compared with 72% of AURKA- cases (P = 0.027) 
and for PLK1, 89% of positive cases were HER2- compared with 
76% of negative cases (P = 0.037). These findings lend weight to 
the idea that the clinical and biological significance of proliferation 
is different between ER+ and ER- tumours. 
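
The associations above rest on Pearson's Chi-square test applied to two-by-two tables of marker status against a clinical characteristic. As an illustration, the sketch below computes the statistic without continuity correction and obtains the P-value for 1 degree of freedom from the complementary error function. The function name is hypothetical, and whether the original analysis applied a continuity correction is not stated, so the uncorrected form is an assumption. The example counts are the AURKA by node status cells for ER+ disease as read from Table 4.

```python
import math


def chi_square_2x2(a: int, b: int, c: int, d: int):
    """Pearson chi-square test for the 2x2 table [[a, b], [c, d]].

    Returns (chi-square statistic, P-value) with 1 degree of freedom and
    no continuity correction.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column must contain at least one case")
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Rows: AURKA negative, AURKA positive; columns: node negative, node positive.
aurka_chi2, aurka_p = chi_square_2x2(626, 333, 165, 140)
```

For these counts the P-value falls below 0.001, in line with the value reported for AURKA and lymph node status. Where expected cell counts are small, as for several PLK1 comparisons, Fisher's exact test is the appropriate alternative, as the authors note.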

Proliferation markers predict poor outcome in ER+ 
disease only 

Univariate survival analyses revealed an association between all the 
proliferation markers and poor outcome in ER+ but not in ER- 
cases (Table 5 and Supplementary Table S3). For ER+ cases, 
AURKA, GMNN, PHH3 and MCM2 were best modelled as 
continuous variables. Both MCM2 and GMNN showed a reduction 
in hazard with time in both complete and imputed data. Ki67 was 
the only proliferation marker significantly associated with survival 
in ER- disease, with an association of nominal significance when 
imputed data were analysed (HR 1.5; 95% CI 1.0-2.1; P = 0.032) and 
a similar point estimate when cases with missing data were 
excluded (HR 1.3; 95% CI 0.88-1.8; P = 0.195). However, in a 
model adjusted for tumour grade, Ki67 no longer showed an 
association with survival in imputed data of ER- cases (HR 1.3; 
95% CI 0.88-1.9; P = 0.200). 

Aurora kinase A and GMNN carried prognostic value independent 
of each other in ER+ disease. The prognostic value of 
proliferation markers was compared by multivariate analysis 
including only the proliferation markers as covariates. Both 
AURKA and GMNN retained independent prognostic significance 
in the analyses of complete and imputed data (Table 6, Model 1). 
This finding supports the hypothesis that different markers of 
proliferation carry distinct prognostic information by better 
reflecting the phases of cell cycle (Gonzalez et al, 2004, 2005; 
Williams and Stoeber, 2007; Loddo et al, 2009). Although MCM2 
was also retained in the multivariate model of complete data, this 
association was not recapitulated when the imputed data were 
analysed. Ki67 did not provide prognostic information independent 
of all the other proliferation markers. 

Aurora kinase A carried prognostic information independent of 
major clinical and molecular characteristics on multivariate 
analysis of ER+ disease (Table 6, Model 2). There were 88 deaths 
from breast cancer in the multivariate model of complete data. The 
increase in relative risk of event was 40% and 30% for complete 
Table 4 Associations of proliferation markers with clinical characteristics in ER-positive disease 

                    MCM2                    Ki67                    AURKA
Variable            Negative    Positive    Negative    Positive    Negative    Positive
Age at diagnosis
  <55               776 (63)    152 (61)    794 (62)    … (62)      637 (62)    212 (65)
  >55               462 (37)    96 (39)     493 (38)    … (38)      395 (38)    115 (35)
  P-value           0.680                   0.925                   0.312
Tumour type
  Ductal            897 (72)    189 (77)    908 (71)    … (76)      718 (70)    266 (81)
  Lobular           219 (18)    30 (12)     245 (19)    … (15)      197 (19)    29 (9)
  Other             122 (10)    28 (11)     134 (10)    … (8)       117 (11)    32 (10)
  P-value           0.096                   0.131                   <0.001
Grade
  1                 316 (29)    19 (8)      338 (30)    … (9)       268 (29)    31 (11)
  2                 601 (55)    117 (51)    624 (55)    … (52)      524 (57)    138 (48)
  3                 176 (16)    93 (41)     166 (15)    … (39)      129 (14)    120 (42)
  P-value           <0.001                  <0.001                  <0.001
Node status
  Negative          723 (63)    141 (60)    750 (63)    … (58)      626 (65)    165 (54)
  Positive          431 (37)    95 (40)     444 (37)    … (42)      333 (35)    140 (46)
  P-value           0.402                   0.115                   <0.001
Tumour size
  <2 cm             711 (59)    144 (60)    741 (60)    … (55)      603 (60)    170 (54)
  2-4.9 cm          457 (38)    91 (38)     469 (38)    … (42)      373 (37)    136 (43)
  ≥5 cm             30 (3)      5 (2)       32 (3)      … (3)       23 (2)      9 (3)
  P-value           0.922                   0.287                   0.131
PR status
  Negative          133 (11)    35 (14)     137 (11)    47 (15)     112 (11)    47 (15)
  Positive          1076 (89)   209 (86)    1117 (89)   263 (85)    898 (89)    276 (85)
  P-value           0.136                   0.038                   0.095
HER2 status
  Negative          1085 (94)   184 (81)    1144 (94)   248 (82)    907 (94)    253 (84)
  Positive          75 (6)      44 (19)     75 (6)      53 (18)     59 (6)      49 (16)
  P-value           <0.001                  <0.001                  <0.001

                    PLK1                    GMNN                    PHH3
Variable            Negative    Positive    Negative    Positive    Negative    Positive
Age at diagnosis
  <55               564 (60)    25 (60)     486 (60)    427 (64)    362 (59)    …
  >55               374 (40)    17 (40)     319 (40)    236 (36)    251 (41)    …
  P-value           0.938                   0.113                   …
Tumour type
  Ductal            673 (72)    34 (81)     536 (67)    520 (79)    439 (72)    …
  Lobular           168 (18)    1 (3)       178 (22)    81 (12)     109 (18)    61 (15)
  Other             97 (10)     7 (17)      91 (11)     61 (9)      65 (11)     35 (9)
  P-value           0.008^a                 <0.001                  0.301
Grade
  1                 199 (24)    4 (10)      239 (…)     …           157 (28)    59 (17)
  2                 488 (58)    14 (35)     395 (…)     …           322 (58)    194 (55)
  3                 153 (18)    22 (55)     73 (…)      …           80 (14)     101 (29)
  P-value           <0.001^a                <0.001                  <0.001
Node status
  Negative          538 (61)    21 (55)     497 (…)     …           369 (64)    212 (59)
  Positive          342 (39)    17 (45)     263 (…)     …           210 (36)    147 (41)
  P-value           …                       0.022                   0.152
Tumour size
  <2 cm             524 (58)    21 (51)     468 (61)    369 (58)    357 (60)    216 (57)
  2-4.9 cm          360 (40)    20 (49)     281 (…)     …           225 (38)    153 (40)
  ≥5 cm             23 (2)      0 (0)       23 (…)      …           11 (2)      12 (3)
  P-value           0.477^a                 0.252                   0.297
PR status
  Negative          116 (13)    2 (5)       97 (12)     75 (12)     62 (10)     51 (13)
  Positive          805 (87)    39 (95)     685 (88)    576 (88)    535 (90)    343 (87)
  P-value           0.219^a                 0.608                   0.215
HER2 status
  Negative          797 (91)    35 (90)     703 (96)    539 (86)    523 (93)    343 (91)
  Positive          79 (9)      4 (10)      30 (4)      87 (14)     41 (7)      34 (9)
  P-value           0.774^a                 <0.001                  0.332

Abbreviations: AURKA = aurora kinase A; ER = oestrogen receptor; GMNN = geminin; HER2 = human epidermal growth-factor receptor 2; MCM2 = mini-chromosome maintenance 
protein 2; PHH3 = phospho-histone H3; PLK1 = polo-like kinase 1; PR = progesterone receptor. ^a Fisher's exact test. Note: Percentages have been rounded to the nearest whole number. 



Table 5 Univariate analysis in ER-positive disease 

Complete case analysis
Variable             n       HR (95% CI)         P         T (95% CI)          P
Grade^a              1560    2.3 (1.8-2.9)       <0.001    NA
Tumour size^a        1705    2.5 (1.9-3.1)       <0.001    NA
Node status          1637    3.9 (2.8-5.4)       <0.001    NA
Endocrine therapy    1771    0.22 (0.05-0.90)    0.036     2.4 (0.95-5.8)      0.063
Chemotherapy         1771    6.2 (2.3-16.8)      <0.001    0.52 (0.28-0.98)    0.044
PR                   1710    0.49 (0.34-0.70)    <0.001    NA
HER2                 1594    2.5 (1.7-3.7)       <0.001    NA
MCM2^a               1485    1.6 (1.2-2.3)       0.005     0.79 (0.64-0.98)    0.032
Ki67                 1599    1.8 (1.3-2.5)       <0.001    NA
AURKA^a              1358    1.6 (1.3-1.9)       <0.001    NA
PLK1                 979     2.6 (1.3-5.2)       0.007     NA
GMNN^a               1467    2.7 (1.6-4.4)       <0.001    0.65 (0.47-0.89)    0.008
PHH3^a               1012    1.5 (1.1-1.9)       0.004     NA

Multiple imputation (M = 50; n = 2237)
Variable             HR (95% CI)         P         T (95% CI)          P
Grade^a              5.7 (2.8-11.5)      <0.001    0.52 (0.34-0.81)    0.004
Tumour size^a        2.4 (1.9-2.9)       <0.001    NA
Node status          3.4 (2.6-4.5)       <0.001    NA
Endocrine therapy    0.30 (0.09-1.0)     0.054     2.0 (0.93-4.4)      0.074
Chemotherapy         8.6 (3.6-20.6)      <0.001    0.45 (0.26-0.78)    0.004
PR                   0.51 (0.36-0.72)    <0.001    NA
HER2                 2.3 (1.6-3.4)       <0.001    NA
MCM2^a               1.5 (1.1-2.0)       0.008     0.83 (0.69-1.0)     0.052
Ki67                 1.9 (1.4-2.5)       <0.001    NA
AURKA^a              1.5 (1.2-1.7)       <0.001    NA
PLK1                 1.7 (0.96-3.0)      0.071     NA
GMNN^a               2.0 (1.4-3.0)       <0.001    0.74 (0.58-0.96)    0.021
PHH3^a               1.3 (1.1-1.6)       0.016     NA

Abbreviations: AURKA = aurora kinase A; CI = confidence interval; ER = oestrogen receptor; GMNN = geminin; HER2 = human epidermal growth-factor receptor 2; 
HR = hazard ratio; MCM2 = mini-chromosome maintenance protein 2; PHH3 = phospho-histone H3; PLK1 = polo-like kinase 1; PR = progesterone receptor; NA = not 
available. ^a Modelled as a continuous variable. 
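
The T column in Table 5 can be read together with the HR column to give a hazard ratio that changes over follow-up. The sketch below assumes the extended Cox model described in the statistical methods, in which the covariate effect varies linearly with log-time, so that HR(t) = HR x T^ln(t). The function name is hypothetical, and the use of years as the time unit (so that the tabulated HR applies at 1 year) is an assumption that the text does not confirm.

```python
import math


def hazard_ratio_at(hr: float, t_coef: float, years: float) -> float:
    """Hazard ratio at a given follow-up time under a log-time varying effect.

    hr      exponentiated main-effect coefficient (HR column)
    t_coef  exponentiated time-varying coefficient (T column)
    years   follow-up time; must be positive because ln(0) is undefined
    """
    if years <= 0:
        raise ValueError("years must be positive")
    return hr * t_coef ** math.log(years)


# Illustration with the imputed-data GMNN estimates (HR 2.0, T 0.74).
gmnn_hr_year_1 = hazard_ratio_at(2.0, 0.74, 1.0)
gmnn_hr_year_10 = hazard_ratio_at(2.0, 0.74, 10.0)
```

Under these assumptions a T value below 1 means that the excess hazard associated with the marker decays over follow-up, which is the pattern described for MCM2 and GMNN.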
Table 6 Multivariate analysis of proliferation markers in ER-positive disease indicating independent prognostic value of AURKA (bold) 

Complete case analysis
Variable          n       HR (95% CI)         P         T (95% CI)          P
Model 1           589
  MCM2^a                  0.43 (0.23-0.82)    0.011     1.6 (1.1-2.4)       0.024
  AURKA^a                 1.6 (1.1-2.2)       0.010     NA
  GMNN^a                  4.0 (1.4-11.7)      0.011     0.51 (0.26-0.99)    0.047
Model 2           884
  Grade^a                 1.4 (0.97-2.0)      0.077     NA
  Node status             2.9 (1.8-4.6)       <0.001    NA
  Tumour size             1.6 (1.1-2.4)       0.018     NA
  PR                      0.51 (0.31-0.85)    0.010     NA
  HER2                    2.4 (1.4-4.1)       0.001     NA
  AURKA^a                 1.4 (1.1-1.7)       0.004     NA

Multiple imputation (M = 50)
Variable          n       HR (95% CI)         P         T (95% CI)          P
Model 1           …
  MCM2^a
  AURKA^a                 1.3 (1.1-1.6)       0.004     NA
  GMNN^a                  1.8 (1.2-2.7)       0.004     0.75 (0.59-0.96)    0.023
Model 2           2237
  Grade^a                 3.9 (1.9-8.0)       <0.001    0.55 (0.35-0.86)    0.008
  Node status             2.6 (2.0-3.5)       <0.001    NA
  Tumour size             1.8 (1.4-2.2)       <0.001    NA
  PR                      0.54 (0.38-0.75)    <0.001    NA
  HER2                    1.7 (1.1-2.5)       0.012     NA
  AURKA^a                 1.3 (1.1-1.5)       0.005     NA

Abbreviations: AURKA = aurora kinase A; CI = confidence interval; ER = oestrogen receptor; GMNN = geminin; HER2 = human epidermal growth-factor receptor 2; 
HR = hazard ratio; MCM2 = mini-chromosome maintenance protein 2; PHH3 = phospho-histone H3; PLK1 = polo-like kinase 1; PR = progesterone receptor. ^a Modelled as a 
continuous variable. 
Figure 2 Kaplan-Meier survival plot of AURKA scores in ER+ disease. 
AURKA expression as Allred proportion scores (0-4, because there were 
no ER+ cases with a score of 5) (log-rank P<0.0001). 
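
For readers unfamiliar with how curves such as those in Figure 2 are obtained, the product-limit (Kaplan-Meier) estimator can be sketched in a few lines. This is a simplified illustration only. It handles right censoring but not the left truncation applied in the study, and the function name and input layout are assumptions.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times   follow-up time for each case
    events  1 (or True) if the case died of breast cancer at that time,
            0 (or False) if the case was censored
    Returns a list of (time, survival probability) at each distinct event time.
    """
    if len(times) != len(events):
        raise ValueError("times and events must have the same length")

    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Collect every case whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += 1 if data[i][1] else 0
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve
```

One curve would be estimated per AURKA score stratum, and the strata would then be compared with a log-rank test, as described in the statistical methods.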

and imputed data, respectively (adjusted 10-year BCSS by 
AURKA score: 0 = 93%, 1 = 90%, 2 = 88%, 3 = 58% and 4 = 
insufficient sample size). Because AURKA was best modelled as 
a continuous variable, these data represent the increase per 
increment of Allred score (Figure 2). Multivariate analysis of the 
same model with AURKA replaced by Ki67 showed that Ki67 also 
retained independent prognostic significance in imputed data 
(HR 1.4; 95% CI 1.0-1.9; P = 0.053) with the same point estimate 
for complete data (HR 1.4; 95% CI 0.97-2.1; P = 0.070). However, 
for the same model, Ki67 was no longer associated with survival in 
the presence of AURKA either in complete (HR 1.1; 95% CI 
0.66-1.7; P = 0.828) or imputed (HR 1.2; 95% CI 0.84-1.6; 
P = 0.346) data. In contrast, in this model including Ki67, AURKA 
retained independent prognostic value in both complete (HR 1.3; 
95% CI 1.0-1.6; P = 0.017) and imputed data (HR 1.2; 95% CI 
1.0-1.4; P = 0.023), confirming that AURKA outperforms Ki67 as a 
prognostic marker in ER+ breast cancer. Although AURKA 
expression was correlated with tumour grade, the relationship 
with luminal molecular subtypes was less pronounced (Figure 3). 
This implies that, in addition to CK5/6, EGFR and HER2, AURKA 
could be used to refine the distinction between luminal subtypes. 
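
Because AURKA was modelled as a continuous variable, the reported hazard ratio applies to each one-point increase in Allred score, and the relative hazard between two scores follows by exponentiation. The sketch below illustrates this arithmetic with the imputed-data estimate from Table 6 (HR 1.3; 95% CI 1.1-1.5). The function name is hypothetical, and the calculation assumes that the log-linear relationship holds across the whole score range, which the model imposes but the data may not fully satisfy.

```python
def scale_hazard_ratio(hr: float, lower: float, upper: float, increments: int):
    """Scale a per-increment hazard ratio and its CI to a larger score difference.

    On the log scale the coefficient and its standard error are both multiplied
    by the number of increments, so each bound is raised to that power.
    """
    if increments < 0:
        raise ValueError("increments must be non-negative")
    return hr ** increments, lower ** increments, upper ** increments


# Relative hazard for an AURKA score of 3 versus 0 (three increments).
score3_vs_0 = scale_hazard_ratio(1.3, 1.1, 1.5, 3)
```

Under this assumption, a case with an AURKA score of 3 would carry roughly twice the hazard of a case with a score of 0, other covariates being equal.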



DISCUSSION 

Proliferation has emerged as a robust prognostic factor in ER+ 
breast cancer (Desmedt et al, 2008; Stuart-Harris et al, 2008; 
Wirapati et al, 2008). Although mitotic count contributes to 
tumour grade, additional measures of proliferation have been 
shown to add prognostic value independent of grade 
(Aleskandarany et al, 2011). Of these, Ki67 labelling by IHC has 
been most widely investigated (Urruticoechea et al, 2005; Cheang 
et al, 2009; Colozza et al, 2010; Yerushalmi et al, 2010). However, 
other promising proliferation-related proteins have received less 
attention as potential prognostic markers (Gonzalez et al, 2003, 
2004, 2005; Loddo et al, 2009). We have compared the prognostic 
utility of a panel of proliferation-related proteins, including Ki67, 
in a large cohort of primary invasive breast tumours. We confirm 
that proliferation markers are significantly associated with survival 
in ER+ disease only and find that AURKA carries the greatest
prognostic value, outperforming Ki67 and serving as an independent
prognostic factor in ER+ breast cancer.

This study has some limitations. First, our conclusions require 
validation in an independent cohort, even though we have 
employed a large study cohort (3093 cases), lending statistical
robustness to our findings, which are also in keeping with
previous reports (Nadler et al, 2008; Loddo et al, 2009). Second, 
we have used TMAs to represent tumours. Although excellent 
concordance between TMAs and full-face sections has been 
reported (Callagy et al, 2003; Ruiz et al, 2006), further evaluation 
of AURKA as a clinical assay would require use of full-face 
sections. Finally, we have not assessed the predictive value of
AURKA in this observational study, as this would be best
addressed in the context of a randomised clinical trial. However, 
our data support the ability of AURKA to predict absolute benefit 
of adjuvant systemic therapy, highlighting the potential clinical 
utility of AURKA. 

Figure 3 Bar charts illustrating the relationship between AURKA and (A) grade and (B) molecular subtype in ER-positive disease.

Prognostic classifiers based on the assessment of tens of genes
have followed seminal studies of breast tumour transcriptomes
(Perou et al, 2000; Sørlie et al, 2001; Paik et al, 2004; Teschendorff
et al, 2006). The prognostic power of these classifiers has been
shown to rely heavily on proliferation-related genes (Desmedt
et al, 2008; Wirapati et al, 2008). These classifiers utilise several
correlated genes to produce a readout of proliferation. Similarly, a
panel of IHC proliferation markers has been proposed to show
greater prognostic significance than a single marker (Gonzalez
et al, 2005; Williams and Stoeber, 2007, 2012; Loddo et al, 2009).
The basis of this additional value has been argued to relate to the
integration of DNA-licensing markers and markers of actively
cycling cells in order to gauge the 'rate' of proliferation in a
given tumour (Gonzalez et al, 2005; Williams and Stoeber, 2007,
2012). The analysis of a panel of cell-cycle-related proteins can 
identify distinct cell-cycle phenotypes both at the level of single 
cells (Endl et al, 2001; Shetty et al, 2005) and cell populations
(Loddo et al, 2009). Indeed, DNA-licensing factors, particularly
MCMs, have been shown to be powerful predictors of clinical
outcome in several solid tumours including prostate, lung,
kidney, breast and ovary (Meng et al, 2001; Ramnath et al, 2001;
Gonzalez et al, 2003, 2004; Dudderidge et al, 2005; Kulkarni
et al, 2007). Three cell-cycle phenotypes can be identified by
integrating markers of cell-cycle progression. They approximate
to (1) an 'out-of-cycle' or differentiated state, defined by the lack
of expression of cell-cycle proteins, including MCMs, in cells that
may express markers of 'differentiated' cells, including inhibitors
of cyclin-dependent kinases such as p27; (2) a 'G1-delayed/arrested'
or growth-arrested state, defined by the expression of
an MCM, hence DNA is 'licensed' for replication, but lacking
expression of mitotic kinases including PLK1 and AURKA
or other markers of actively proliferating cells including Ki67 or
GMNN; and (3) an 'accelerated cell cycle' or actively proliferating
state, defined by the expression of both MCMs and proteins
expressed after the cell-cycle restriction point including
AURKA, GMNN, PLK1, PHH3 and Ki67 (Endl et al, 2001;
Dudderidge et al, 2005; Williams and Stoeber, 2007; Loddo et al,
2009). A scheme for determining cell-cycle phenotype in this
way holds particular promise as a predictive biomarker by 
identifying tumours sensitive to cell-cycle phase-specific
chemotherapeutic agents (Williams and Stoeber, 2007, 2012; Loddo
et al, 2009). We addressed the hypothesis that multi-parameter
estimates of cell-cycle phenotype would outperform single- 
marker assays as predictors of outcome by including markers 
expressed differentially during cell cycle in a multivariate 
analysis. We found that GMNN and AURKA indeed provided 
independent prognostic information. However, subsequent 
analysis in a model adjusted for the standard clinical variables 
showed that only AURKA retained independent prognostic value.
This may arise as a result of our assessing protein expression as 
a proportion of a population of cancer cells separately for each 
cell-cycle marker and subsequently comparing these scores in a 
multivariate model. This cell-population approach may not 
identify the proposed cell phenotypes, particularly growth- 
arrested cells, with adequate sensitivity. A multiplexed single- 
cell assay, which determines the proportion of cells in each of 
the three phenotypes per tumour, may overcome this limitation, 
especially if combined with the sophisticated methods 
of automated image analysis (Camp et al, 2002; Williams and 
Stoeber, 2012). 
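
The three-phenotype scheme described above can be sketched as a simple rule set. The following is a minimal illustration only, assuming hypothetical per-cell positive/negative calls for each marker; the present study scored each marker separately at the cell-population level and did not perform such an assay. MCM2 stands in for the MCMs, and the 'unclassified' category and the function names are our own assumptions.

```python
from collections import Counter

# Markers the text lists as expressed after the cell-cycle restriction point.
POST_RESTRICTION_MARKERS = ("AURKA", "GMNN", "PLK1", "PHH3", "Ki67")


def classify_cell(markers: dict) -> str:
    """Assign one cell to a phenotype from boolean marker calls.

    `markers` maps a marker name to True/False (hypothetical per-cell
    positivity calls); missing markers are treated as negative.
    """
    licensed = markers.get("MCM2", False)
    cycling = any(markers.get(m, False) for m in POST_RESTRICTION_MARKERS)
    if licensed and cycling:
        return "actively proliferating"
    if licensed:
        return "G1-delayed/arrested"
    if not cycling:
        return "out-of-cycle"
    # MCM-negative but positive for a later marker: outside the scheme.
    return "unclassified"


def phenotype_proportions(cells: list) -> dict:
    """Fraction of cells in each phenotype for one tumour."""
    if not cells:
        return {}
    counts = Counter(classify_cell(cell) for cell in cells)
    return {name: n / len(cells) for name, n in counts.items()}


# Four made-up cells from a single hypothetical tumour.
example_cells = [
    {"MCM2": False},
    {"MCM2": True},
    {"MCM2": True, "AURKA": True, "Ki67": True},
    {"MCM2": True, "GMNN": True},
]
example_proportions = phenotype_proportions(example_cells)
print(example_proportions)
```

For the four example cells, half are assigned to the actively proliferating state and a quarter to each of the other two states. A real assay would also need validated positivity thresholds for each marker, which are not specified here.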



Aurora kinase A is among the proliferation genes that contribute 
to the 21-gene recurrence score (Paik et al, 2004). Aurora kinase A
is required for proper centrosome function and for mitotic spindle 
assembly (Lens et al, 2010). As a protein that functions
specifically during mitosis, AURKA also represents an attractive 
drug target and several AURKA inhibitors are under development 
(Keen and Taylor, 2004; Lens et al, 2010). The basis of the superior 
prognostic performance of AURKA compared with the other 
proliferation markers is not clear and is likely to relate to many 
variables including biological function, assay differences and ease 
of interpretation. Aurora kinase A was one of the proliferation 
markers best modelled as a continuous variable. This is consistent
with the idea that luminal tumours form a continuum according to 
the expression levels of proliferation-related genes and that their 
division into two subgroups is somewhat arbitrary (Desmedt et al, 
2008; Wirapati et al, 2008; Colombo et al, 2011). In this respect, 
AURKA labelling by IHC could be used as a means of better 
reflecting this diversity in clinical practice. Moreover, the 
prognostic utility of AURKA may be increased by including it in 
a combined index with B-cell lymphoma protein 2, just as we have 
recently shown for Ki67 (Ali et al, 2012). In addition, AURKA gene
expression has recently been used as a prototypical proliferation 
marker in a three-gene classifier for the molecular subtyping of 
breast cancer shown to be more statistically robust than other 
methods (Haibe-Kains et al, 2012). 

In summary, we have conducted a large head-to-head comparison 
of the prognostic value of a panel of proliferation markers in primary 
breast cancer. We have used IHC and a scoring system used routinely 
in clinical practice to show that the prognostic significance of 
proliferation is limited to ER+ disease and that AURKA outperforms
other proliferation markers including Ki67. Aurora kinase A defines 
five subgroups in ER+ breast cancer and carries independent 
prognostic significance in multivariate analysis. Our findings show 
that Ki67 may not be the optimal IHC marker of proliferation and 
warrant further studies addressing the predictive value of AURKA. 